Overview

Dataset statistics

Number of variables: 7
Number of observations: 768
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 42.1 KiB
Average record size in memory: 56.2 B

Variable types

Numeric: 6
Categorical: 1

Alerts

BloodPressure has 35 (4.6%) zeros (alert: Zeros)
Insulin has 374 (48.7%) zeros (alert: Zeros)
BMI has 11 (1.4%) zeros (alert: Zeros)
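
The percentages in these alerts can be reproduced from the zero counts and the 768 observations listed under Dataset statistics. The sketch below is a minimal check in plain Python; the dictionary and function names are illustrative and are not part of ydata-profiling.

```python
# Zero counts copied from the Alerts list; n_rows from "Number of observations".
n_rows = 768
zero_counts = {"BloodPressure": 35, "Insulin": 374, "BMI": 11}


def zero_share(count, total):
    """Return the percentage of rows equal to zero, rounded to one decimal."""
    return round(100 * count / total, 1)


shares = {name: zero_share(count, n_rows) for name, count in zero_counts.items()}
for name, pct in shares.items():
    print(f"{name}: {pct}% zeros")
```

Whether these zeros are genuine measurements or placeholders for missing readings is not stated in the report and would need to be checked against the data source.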

Reproduction

Analysis started: 2024-05-24 14:00:54.735392
Analysis finished: 2024-05-24 14:01:03.539241
Duration: 8.8 seconds
Software version: ydata-profiling v4.7.0
Download configuration: config.json

Variables

Glucose
Real number (ℝ)

Distinct: 136
Distinct (%): 17.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 120.89453
Minimum: 0
Maximum: 199
Zeros: 5
Zeros (%): 0.7%
Negative: 0
Negative (%): 0.0%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 79
Q1: 99
median: 117
Q3: 140.25
95-th percentile: 181
Maximum: 199
Range: 199
Interquartile range (IQR): 41.25
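
Range and IQR in the quantile statistics are simple differences of the values listed above. The following sketch recomputes them for Glucose and, as an added assumption not taken from the report, derives Tukey's 1.5 × IQR fences to show where the zero readings fall.

```python
# Quantiles copied from the Glucose "Quantile statistics" block.
minimum, q1, q3, maximum = 0, 99, 140.25, 199

value_range = maximum - minimum   # reported as Range
iqr = q3 - q1                     # reported as Interquartile range (IQR)

# Tukey fences: a common outlier convention, assumed here for illustration only.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

print("range:", value_range, "IQR:", iqr)
print("fences:", lower_fence, upper_fence)
print("zero readings fall below the lower fence:", 0 < lower_fence)
```

Under this convention the zero readings sit below the lower fence, while the maximum of 199 stays inside the upper fence.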

Descriptive statistics

Standard deviation: 31.972618
Coefficient of variation (CV): 0.26446703
Kurtosis: 0.64077982
Mean: 120.89453
Median Absolute Deviation (MAD): 20
Skewness: 0.1737535
Sum: 92847
Variance: 1022.2483
Monotonicity: Not monotonic
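
Several of the descriptive statistics are linked by simple identities: the mean is the sum divided by the number of observations, the variance is the squared standard deviation, and the coefficient of variation is the standard deviation divided by the mean. A minimal sketch that checks the Glucose figures against each other (variable names are illustrative):

```python
# Values copied from the report: Sum and Standard deviation for Glucose,
# and the number of observations from Dataset statistics.
n_rows = 768
total = 92847
std_dev = 31.972618

mean = total / n_rows        # Mean = Sum / N
variance = std_dev ** 2      # Variance = (Standard deviation)^2
cv = std_dev / mean          # Coefficient of variation = std / mean

print(round(mean, 5), round(variance, 4), round(cv, 6))
```

The results agree with the reported Mean, Variance and CV up to the rounding used in the report.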
Histogram with fixed size bins (bins=50)

Value               Count  Frequency (%)
99                  17     2.2%
100                 17     2.2%
111                 14     1.8%
129                 14     1.8%
125                 14     1.8%
106                 14     1.8%
112                 13     1.7%
108                 13     1.7%
95                  13     1.7%
105                 13     1.7%
Other values (126)  626    81.5%

Value               Count  Frequency (%)
0                   5      0.7%
44                  1      0.1%
56                  1      0.1%
57                  2      0.3%
61                  1      0.1%
62                  1      0.1%
65                  1      0.1%
67                  1      0.1%
68                  3      0.4%
71                  4      0.5%

Value               Count  Frequency (%)
199                 1      0.1%
198                 1      0.1%
197                 4      0.5%
196                 3      0.4%
195                 2      0.3%
194                 3      0.4%
193                 2      0.3%
191                 1      0.1%
190                 1      0.1%
189                 4      0.5%

BloodPressure
Real number (ℝ)

ZEROS 

Distinct: 47
Distinct (%): 6.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 69.105469
Minimum: 0
Maximum: 122
Zeros: 35
Zeros (%): 4.6%
Negative: 0
Negative (%): 0.0%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 38.7
Q1: 62
median: 72
Q3: 80
95-th percentile: 90
Maximum: 122
Range: 122
Interquartile range (IQR): 18

Descriptive statistics

Standard deviation: 19.355807
Coefficient of variation (CV): 0.28009082
Kurtosis: 5.1801566
Mean: 69.105469
Median Absolute Deviation (MAD): 8
Skewness: -1.843608
Sum: 53073
Variance: 374.64727
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=47)

Value               Count  Frequency (%)
70                  57     7.4%
74                  52     6.8%
78                  45     5.9%
68                  45     5.9%
72                  44     5.7%
64                  43     5.6%
80                  40     5.2%
76                  39     5.1%
60                  37     4.8%
0                   35     4.6%
Other values (37)   331    43.1%

Value               Count  Frequency (%)
0                   35     4.6%
24                  1      0.1%
30                  2      0.3%
38                  1      0.1%
40                  1      0.1%
44                  4      0.5%
46                  2      0.3%
48                  5      0.7%
50                  13     1.7%
52                  11     1.4%

Value               Count  Frequency (%)
122                 1      0.1%
114                 1      0.1%
110                 3      0.4%
108                 2      0.3%
106                 3      0.4%
104                 2      0.3%
102                 1      0.1%
100                 3      0.4%
98                  3      0.4%
96                  4      0.5%

Insulin
Real number (ℝ)

ZEROS 

Distinct: 186
Distinct (%): 24.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 79.799479
Minimum: 0
Maximum: 846
Zeros: 374
Zeros (%): 48.7%
Negative: 0
Negative (%): 0.0%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 30.5
Q3: 127.25
95-th percentile: 293
Maximum: 846
Range: 846
Interquartile range (IQR): 127.25

Descriptive statistics

Standard deviation: 115.244
Coefficient of variation (CV): 1.4441699
Kurtosis: 7.2142596
Mean: 79.799479
Median Absolute Deviation (MAD): 30.5
Skewness: 2.2722509
Sum: 61286
Variance: 13281.18
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value               Count  Frequency (%)
0                   374    48.7%
105                 11     1.4%
130                 9      1.2%
140                 9      1.2%
120                 8      1.0%
94                  7      0.9%
180                 7      0.9%
100                 7      0.9%
135                 6      0.8%
115                 6      0.8%
Other values (176)  324    42.2%

Value               Count  Frequency (%)
0                   374    48.7%
14                  1      0.1%
15                  1      0.1%
16                  1      0.1%
18                  2      0.3%
22                  1      0.1%
23                  2      0.3%
25                  1      0.1%
29                  1      0.1%
32                  1      0.1%

Value               Count  Frequency (%)
846                 1      0.1%
744                 1      0.1%
680                 1      0.1%
600                 1      0.1%
579                 1      0.1%
545                 1      0.1%
543                 1      0.1%
540                 1      0.1%
510                 1      0.1%
495                 2      0.3%

BMI
Real number (ℝ)

ZEROS 

Distinct: 248
Distinct (%): 32.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 31.992578
Minimum: 0
Maximum: 67.1
Zeros: 11
Zeros (%): 1.4%
Negative: 0
Negative (%): 0.0%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 21.8
Q1: 27.3
median: 32
Q3: 36.6
95-th percentile: 44.395
Maximum: 67.1
Range: 67.1
Interquartile range (IQR): 9.3

Descriptive statistics

Standard deviation: 7.8841603
Coefficient of variation (CV): 0.24643717
Kurtosis: 3.2904429
Mean: 31.992578
Median Absolute Deviation (MAD): 4.6
Skewness: -0.42898159
Sum: 24570.3
Variance: 62.159984
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value               Count  Frequency (%)
32                  13     1.7%
31.6                12     1.6%
31.2                12     1.6%
0                   11     1.4%
32.4                10     1.3%
33.3                10     1.3%
30.1                9      1.2%
32.8                9      1.2%
32.9                9      1.2%
30.8                9      1.2%
Other values (238)  664    86.5%

Value               Count  Frequency (%)
0                   11     1.4%
18.2                3      0.4%
18.4                1      0.1%
19.1                1      0.1%
19.3                1      0.1%
19.4                1      0.1%
19.5                2      0.3%
19.6                3      0.4%
19.9                1      0.1%
20                  1      0.1%

Value               Count  Frequency (%)
67.1                1      0.1%
59.4                1      0.1%
57.3                1      0.1%
55                  1      0.1%
53.2                1      0.1%
52.9                1      0.1%
52.3                2      0.3%
50                  1      0.1%
49.7                1      0.1%
49.6                1      0.1%
0.1%

DiabetesPedigreeFunction
Real number (ℝ)

Distinct: 517
Distinct (%): 67.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.4718763
Minimum: 0.078
Maximum: 2.42
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 0.078
5-th percentile: 0.14035
Q1: 0.24375
median: 0.3725
Q3: 0.62625
95-th percentile: 1.13285
Maximum: 2.42
Range: 2.342
Interquartile range (IQR): 0.3825

Descriptive statistics

Standard deviation: 0.3313286
Coefficient of variation (CV): 0.70215138
Kurtosis: 5.5949535
Mean: 0.4718763
Median Absolute Deviation (MAD): 0.1675
Skewness: 1.9199111
Sum: 362.401
Variance: 0.10977864
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value               Count  Frequency (%)
0.258               6      0.8%
0.254               6      0.8%
0.268               5      0.7%
0.207               5      0.7%
0.261               5      0.7%
0.259               5      0.7%
0.238               5      0.7%
0.19                4      0.5%
0.263               4      0.5%
0.299               4      0.5%
Other values (507)  719    93.6%

Value               Count  Frequency (%)
0.078               1      0.1%
0.084               1      0.1%
0.085               2      0.3%
0.088               2      0.3%
0.089               1      0.1%
0.092               1      0.1%
0.096               1      0.1%
0.1                 1      0.1%
0.101               1      0.1%
0.102               1      0.1%

Value               Count  Frequency (%)
2.42                1      0.1%
2.329               1      0.1%
2.288               1      0.1%
2.137               1      0.1%
1.893               1      0.1%
1.781               1      0.1%
1.731               1      0.1%
1.699               1      0.1%
1.698               1      0.1%
1.6                 1      0.1%

Age
Real number (ℝ)

Distinct: 52
Distinct (%): 6.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 33.240885
Minimum: 21
Maximum: 81
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 21
5-th percentile: 21
Q1: 24
median: 29
Q3: 41
95-th percentile: 58
Maximum: 81
Range: 60
Interquartile range (IQR): 17

Descriptive statistics

Standard deviation: 11.760232
Coefficient of variation (CV): 0.35378816
Kurtosis: 0.64315889
Mean: 33.240885
Median Absolute Deviation (MAD): 7
Skewness: 1.1295967
Sum: 25529
Variance: 138.30305
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value               Count  Frequency (%)
22                  72     9.4%
21                  63     8.2%
25                  48     6.2%
24                  46     6.0%
23                  38     4.9%
28                  35     4.6%
26                  33     4.3%
27                  32     4.2%
29                  29     3.8%
31                  24     3.1%
Other values (42)   348    45.3%

Value               Count  Frequency (%)
21                  63     8.2%
22                  72     9.4%
23                  38     4.9%
24                  46     6.0%
25                  48     6.2%
26                  33     4.3%
27                  32     4.2%
28                  35     4.6%
29                  29     3.8%
30                  21     2.7%

Value               Count  Frequency (%)
81                  1      0.1%
72                  1      0.1%
70                  1      0.1%
69                  2      0.3%
68                  1      0.1%
67                  3      0.4%
66                  4      0.5%
65                  3      0.4%
64                  1      0.1%
63                  4      0.5%

Outcome
Categorical

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 6.1 KiB
0: 500
1: 268

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 768
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 0
3rd row: 1
4th row: 0
5th row: 1

Common Values

Value               Count  Frequency (%)
0                   500    65.1%
1                   268    34.9%
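
The frequencies above follow from the two class counts. The sketch below recomputes them and adds an imbalance ratio (majority count divided by minority count); the ratio is an illustrative extra that the report itself does not list.

```python
# Class counts copied from the Common Values table for Outcome.
counts = {"0": 500, "1": 268}

n_rows = sum(counts.values())
frequencies = {label: round(100 * c / n_rows, 1) for label, c in counts.items()}

# Majority-to-minority ratio: not part of the report, shown for illustration.
imbalance_ratio = max(counts.values()) / min(counts.values())

print(n_rows, frequencies, round(imbalance_ratio, 2))
```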

Length

Histogram of lengths of the category

Common Values (Plot)

Value               Count  Frequency (%)
0                   500    65.1%
1                   268    34.9%

Most occurring characters

Value               Count  Frequency (%)
0                   500    65.1%
1                   268    34.9%

Most occurring categories

Value               Count  Frequency (%)
Decimal Number      768    100.0%

Most frequent character per category

Decimal Number
Value               Count  Frequency (%)
0                   500    65.1%
1                   268    34.9%

Most occurring scripts

Value               Count  Frequency (%)
Common              768    100.0%

Most frequent character per script

Common
Value               Count  Frequency (%)
0                   500    65.1%
1                   268    34.9%

Most occurring blocks

Value               Count  Frequency (%)
ASCII               768    100.0%

Most frequent character per block

ASCII
Value               Count  Frequency (%)
0                   500    65.1%
1                   268    34.9%

Interactions

2024-05-24T19:31:01.747828image/svg+xmlMatplotlib v3.7.5, https://matplotlib.org/
2024-05-24T19:30:55.390612image/svg+xmlMatplotlib v3.7.5, https://matplotlib.org/
2024-05-24T19:30:56.732757image/svg+xmlMatplotlib v3.7.5, https://matplotlib.org/
2024-05-24T19:30:57.964358image/svg+xmlMatplotlib v3.7.5, https://matplotlib.org/
2024-05-24T19:30:59.240584image/svg+xmlMatplotlib v3.7.5, https://matplotlib.org/
2024-05-24T19:31:00.513125image/svg+xmlMatplotlib v3.7.5, https://matplotlib.org/
2024-05-24T19:31:01.968749image/svg+xmlMatplotlib v3.7.5, https://matplotlib.org/
2024-05-24T19:30:55.645543image/svg+xmlMatplotlib v3.7.5, https://matplotlib.org/
2024-05-24T19:30:56.939273image/svg+xmlMatplotlib v3.7.5, https://matplotlib.org/
2024-05-24T19:30:58.172181image/svg+xmlMatplotlib v3.7.5, https://matplotlib.org/
2024-05-24T19:30:59.450739image/svg+xmlMatplotlib v3.7.5, https://matplotlib.org/
2024-05-24T19:31:00.719689image/svg+xmlMatplotlib v3.7.5, https://matplotlib.org/
2024-05-24T19:31:02.183799image/svg+xmlMatplotlib v3.7.5, https://matplotlib.org/
2024-05-24T19:30:55.851418image/svg+xmlMatplotlib v3.7.5, https://matplotlib.org/
2024-05-24T19:30:57.152849image/svg+xmlMatplotlib v3.7.5, https://matplotlib.org/
2024-05-24T19:30:58.373770image/svg+xmlMatplotlib v3.7.5, https://matplotlib.org/
2024-05-24T19:30:59.656347image/svg+xmlMatplotlib v3.7.5, https://matplotlib.org/
2024-05-24T19:31:00.943167image/svg+xmlMatplotlib v3.7.5, https://matplotlib.org/
2024-05-24T19:31:02.355887image/svg+xmlMatplotlib v3.7.5, https://matplotlib.org/
2024-05-24T19:30:56.055947image/svg+xmlMatplotlib v3.7.5, https://matplotlib.org/
2024-05-24T19:30:57.340410image/svg+xmlMatplotlib v3.7.5, https://matplotlib.org/
2024-05-24T19:30:58.597220image/svg+xmlMatplotlib v3.7.5, https://matplotlib.org/
2024-05-24T19:30:59.851018image/svg+xmlMatplotlib v3.7.5, https://matplotlib.org/
2024-05-24T19:31:01.128354image/svg+xmlMatplotlib v3.7.5, https://matplotlib.org/
2024-05-24T19:31:02.597745image/svg+xmlMatplotlib v3.7.5, https://matplotlib.org/
2024-05-24T19:30:56.273184image/svg+xmlMatplotlib v3.7.5, https://matplotlib.org/
2024-05-24T19:30:57.548115image/svg+xmlMatplotlib v3.7.5, https://matplotlib.org/
2024-05-24T19:30:58.835711image/svg+xmlMatplotlib v3.7.5, https://matplotlib.org/
2024-05-24T19:31:00.072435image/svg+xmlMatplotlib v3.7.5, https://matplotlib.org/
2024-05-24T19:31:01.335002image/svg+xmlMatplotlib v3.7.5, https://matplotlib.org/
2024-05-24T19:31:02.826161image/svg+xmlMatplotlib v3.7.5, https://matplotlib.org/
2024-05-24T19:30:56.505124image/svg+xmlMatplotlib v3.7.5, https://matplotlib.org/
2024-05-24T19:30:57.756711image/svg+xmlMatplotlib v3.7.5, https://matplotlib.org/
2024-05-24T19:30:59.040905image/svg+xmlMatplotlib v3.7.5, https://matplotlib.org/
2024-05-24T19:31:00.285793image/svg+xmlMatplotlib v3.7.5, https://matplotlib.org/
2024-05-24T19:31:01.539355image/svg+xmlMatplotlib v3.7.5, https://matplotlib.org/

Correlations

                          Age     BMI     BloodPressure  DiabetesPedigreeFunction  Glucose  Insulin  Outcome
Age                       1.000   0.131   0.351          0.043                     0.285    -0.114   0.314
BMI                       0.131   1.000   0.293          0.141                     0.231    0.193    0.317
BloodPressure             0.351   0.293   1.000          0.030                     0.235    -0.007   0.152
DiabetesPedigreeFunction  0.043   0.141   0.030          1.000                     0.091    0.221    0.173
Glucose                   0.285   0.231   0.235          0.091                     1.000    0.213    0.487
Insulin                   -0.114  0.193   -0.007         0.221                     0.213    1.000    0.159
Outcome                   0.314   0.317   0.152          0.173                     0.487    0.159    1.000
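
One way to read the matrix is to rank the other variables by the absolute size of their correlation with Outcome. The sketch below does this with the values from the Outcome row. The report text does not say which correlation coefficient was used, so the ranking is only a reading aid and the names are illustrative.

```python
# Correlations with Outcome, copied from the Outcome row of the matrix.
outcome_corr = {
    "Age": 0.314,
    "BMI": 0.317,
    "BloodPressure": 0.152,
    "DiabetesPedigreeFunction": 0.173,
    "Glucose": 0.487,
    "Insulin": 0.159,
}

# Sort by absolute value so strong negative correlations would rank high too.
ranked = sorted(outcome_corr.items(), key=lambda kv: abs(kv[1]), reverse=True)

for name, value in ranked:
    print(f"{name:<26}{value:.3f}")
```

Glucose ranks first at 0.487 and BloodPressure last at 0.152.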

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

     Glucose  BloodPressure  Insulin  BMI   DiabetesPedigreeFunction  Age  Outcome
0    148      72             0        33.6  0.627                     50   1
1    85       66             0        26.6  0.351                     31   0
2    183      64             0        23.3  0.672                     32   1
3    89       66             94       28.1  0.167                     21   0
4    137      40             168      43.1  2.288                     33   1
5    116      74             0        25.6  0.201                     30   0
6    78       50             88       31.0  0.248                     26   1
7    115      0              0        35.3  0.134                     29   0
8    197      70             543      30.5  0.158                     53   1
9    125      96             0        0.0   0.232                     54   1

     Glucose  BloodPressure  Insulin  BMI   DiabetesPedigreeFunction  Age  Outcome
758  106      76             0        37.5  0.197                     26   0
759  190      92             0        35.5  0.278                     66   1
760  88       58             16       28.4  0.766                     22   0
761  170      74             0        44.0  0.403                     43   1
762  89       62             0        22.5  0.142                     33   0
763  101      76             180      32.9  0.171                     63   0
764  122      70             0        36.8  0.340                     27   0
765  121      72             112      26.2  0.245                     30   0
766  126      60             0        30.1  0.349                     47   1
767  93       70             0        30.4  0.315                     23   0
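
The sample rows show the zeros that the alerts flag. The sketch below loads the first ten sample rows and replaces zeros with None in the three flagged columns. Treating those zeros as missing values is an assumption made for illustration; the report only counts them. The helper name mask_zeros is hypothetical.

```python
COLUMNS = ("Glucose", "BloodPressure", "Insulin", "BMI",
           "DiabetesPedigreeFunction", "Age", "Outcome")

# First ten sample rows (index 0 to 9), copied from the report.
ROWS = [
    (148, 72, 0, 33.6, 0.627, 50, 1),
    (85, 66, 0, 26.6, 0.351, 31, 0),
    (183, 64, 0, 23.3, 0.672, 32, 1),
    (89, 66, 94, 28.1, 0.167, 21, 0),
    (137, 40, 168, 43.1, 2.288, 33, 1),
    (116, 74, 0, 25.6, 0.201, 30, 0),
    (78, 50, 88, 31.0, 0.248, 26, 1),
    (115, 0, 0, 35.3, 0.134, 29, 0),
    (197, 70, 543, 30.5, 0.158, 53, 1),
    (125, 96, 0, 0.0, 0.232, 54, 1),
]

# Columns carrying a Zeros alert. Assumption: a zero here means "not recorded".
ZERO_AS_MISSING = ("BloodPressure", "Insulin", "BMI")


def mask_zeros(row):
    """Turn a row tuple into a dict, replacing flagged zeros with None."""
    record = dict(zip(COLUMNS, row))
    for column in ZERO_AS_MISSING:
        if record[column] == 0:
            record[column] = None
    return record


records = [mask_zeros(row) for row in ROWS]
missing_counts = {
    column: sum(record[column] is None for record in records)
    for column in ZERO_AS_MISSING
}
print(missing_counts)
```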